Caldanaerobacter yonseiensis is a strictly anaerobic, thermophilic, spore-forming bacterium, which was isolated from a
geothermal hot stream in Indonesia. This bacterium utilizes xylose and produces a variety of proteases. Here, we report the
draft genome sequence of C. yonseiensis, which reveals insights into the pentose phosphate pathway and protein degradation
metabolism in thermophilic microorganisms.

Caldanaerobacter yonseiensis was isolated from a geothermal
hot stream at Sileri, Java Island, in Indonesia (1). This strain
was previously classified as a new species of the genus
Thermoanaerobacter, designated Thermoanaerobacter yonseiensis, but
it was later reassigned as a novel genus and species,
Caldanaerobacter yonseiensis (2). This strain is a strictly
anaerobic, extremely thermophilic,
spore-forming, and xylose-utilizing bacterium. C. yonseiensis can 
grow in the temperature range of 50 to 85°C, with an optimum at 
75°C, and a pH range for growth of 4.5 to 9.0, with an optimum at 
pH 6.5 (1). These properties indicated that C. yonseiensis might be 
a good source for the isolation of thermostable and acidophilic 
D-xylose isomerase, which is suitable for the production of
high-fructose corn syrup in the food industry (3-5). Moreover,
C. yonseiensis produces a novel subtilisin-like protease, thermicin,
that showed maximum proteolytic activity at 92.5°C and pH 9.0,
indicating that this enzyme will be applicable for the hydrolysis of
collagens at elevated temperatures without contamination (6). 
Thus, C. yonseiensis is of great interest as a potential industrial 
microorganism, which led us to sequence the whole genome of 
this microorganism. 

The genome of C. yonseiensis was sequenced on the Ion Torrent
Personal Genome Machine (PGM) sequencer system using a
316 D sequencing chip (7). The sequence was assembled using
MIRA 3.4.0. The assembled genome consists of 102 contigs
(>500 bp), with a total size of 2,700,546 bp at 34.96-fold
coverage and a G+C content of 36.6%. The assembled contigs were
submitted to the RAST annotation server (http://rast.nmpdr.org/)
for subsystem classification and functional annotation.
There were a total of 2,905 predicted protein-coding sequences 
(CDS), with 49% assigned to recognizable functional genes. While 
the 16S rRNA gene sequence of C. yonseiensis showed similarity
to both C. tengcongensis strain MB4 (99% sequence identity) and
C. keratinophilus strain 2KXI (99% sequence identity), RAST
analysis suggested that C. tengcongensis MB4 was the closest
neighbor in terms of sequence similarity.
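
The summary figures quoted above (number of contigs longer than 500 bp, total assembly size, G+C content, and fold coverage) can be recomputed from any contig FASTA file. The following is a minimal sketch in Python and is not the pipeline used in this work; the function names, the 500-bp default cutoff, and the `total_read_bases` input are illustrative assumptions, since the total number of sequenced bases is not reported here.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def assembly_stats(contigs: Iterable[Tuple[str, str]],
                   min_len: int = 500) -> Dict[str, float]:
    """Summarize contigs strictly longer than min_len (hypothetical helper)."""
    kept = [seq for _, seq in contigs if len(seq) > min_len]
    total_bp = sum(len(seq) for seq in kept)
    gc = sum(seq.count("G") + seq.count("C") for seq in kept)
    # Ambiguous bases (e.g. N) are excluded from the G+C denominator.
    acgt = sum(seq.count(base) for seq in kept for base in "ACGT")
    return {
        "contigs": len(kept),
        "total_bp": total_bp,
        "gc_percent": 100.0 * gc / acgt if acgt else 0.0,
    }


def fold_coverage(total_read_bases: int, assembly_bp: int) -> float:
    """Average depth: sequenced bases divided by assembly size."""
    return total_read_bases / assembly_bp
```

With a contig file on disk (the file name here is hypothetical), the call would be `assembly_stats(read_fasta(open("contigs.fasta")))`. Excluding ambiguous bases from the G+C denominator is one common convention; other tools may count them, which shifts the percentage slightly.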

In the draft sequence, we identified 13 genes that encode
proteins related to xylose utilization, including two xylose
isomerase-coding genes involved in xylose metabolism. Also identified
were genes that encode glucose-6-phosphate 1-dehydrogenase
and 6-phosphogluconate dehydrogenase, which are among the 
key enzymes for the generation of reducing power (NADPH) in 
the pentose phosphate pathway. Another feature of C. yonseiensis 
is the ability to hydrolyze glycine- and proline-rich collagens (6). 
Based on our data, a gene encoding a subtilisin-like serine
protease, designated thermicin, was identified, together with four
other genes encoding serine proteases. Moreover, four genes
encoding cysteine proteases and 27 genes encoding a variety of
proteases were identified. Given these results, the draft genome
sequence of C. yonseiensis will not only provide insights into the
metabolism of monomeric and polymeric carbohydrates under
extremely thermophilic conditions but will also encourage further
study of a variety of proteases potentially applicable in
detergents and in the leather and textile industries.

Nucleotide sequence accession numbers. This whole-genome
shotgun project has been deposited at DDBJ/EMBL/GenBank
under the accession number AXDC00000000. The version described
in this paper is the first version, AXDC01000000.
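
The two accession numbers above differ only in the two digits that follow the four-letter project prefix. A minimal Python sketch of how such identifiers can be split is given below. It assumes the general whole-genome shotgun accession layout (a four- or six-letter prefix, a two-digit assembly version, and a contig number of six or more digits); the pattern and the names `WgsAccession` and `parse_wgs_accession` are illustrative and are not part of any database client library.

```python
import re
from typing import NamedTuple

# Assumed layout: 4- or 6-letter prefix, 2-digit version, 6+ digit contig number.
WGS_PATTERN = re.compile(r"^([A-Z]{4}|[A-Z]{6})(\d{2})(\d{6,})$")


class WgsAccession(NamedTuple):
    prefix: str   # project prefix, e.g. "AXDC"
    version: int  # assembly version; 0 denotes the master record
    contig: int   # contig number; 0 denotes the project or version as a whole


def parse_wgs_accession(accession: str) -> WgsAccession:
    """Split a WGS-style accession into prefix, version, and contig number."""
    match = WGS_PATTERN.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not a WGS-style accession: {accession!r}")
    prefix, version, contig = match.groups()
    return WgsAccession(prefix, int(version), int(contig))
```

Under this assumed layout, AXDC00000000 parses to version 0 (the master record) and AXDC01000000 to version 1, in agreement with the statement that the version described here is the first.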